NSF PAR Search | NSF Public Access Repository

What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model

https://doi.org/10.21437/Interspeech.2023-2254

Yang, Mu; Shekar, Ram C.; Kang, Okim; Hansen, John H. (August 2023, ISCA INTERSPEECH-2023)
This study is focused on understanding and quantifying the change in phoneme and prosody information encoded in the Self-Supervised Learning (SSL) model, brought by an accent identification (AID) fine-tuning task. This problem is addressed based on model probing. Specifically, we conduct a systematic layer-wise analysis of the representations of the Transformer layers on a phoneme correlation task, and a novel word-level prosody prediction task. We compare the probing performance of the pre-trained and fine-tuned SSL models. Results show that the AID fine-tuning task steers the top 2 layers to learn richer phoneme and prosody representation. These changes share some similarities with the effects of fine-tuning with an Automatic Speech Recognition task. In addition, we observe strong accent-specific phoneme representations in layer 9. To sum up, this study provides insights into the understanding of SSL features and their interactions with fine-tuning tasks.
Full Text Available
Assessment of Non-Native Speech Intelligibility using Wav2vec2-based Mispronunciation Detection and Multi-level Goodness of Pronunciation Transformer

https://doi.org/10.21437/Interspeech.2023-2371

Shekar, Ram C.; Yang, Mu; Hirschi, Kevin; Looney, Stephen; Kang, Okim; Hansen, John H. (August 2023, ISCA INTERSPEECH-2023)
Automatic pronunciation assessment (APA) plays an important role in providing feedback for self-directed language learners in computer-assisted pronunciation training (CAPT). Several mispronunciation detection and diagnosis (MDD) systems have achieved promising performance based on end-to-end phoneme recognition. However, assessing the intelligibility of second language (L2) speech remains a challenging problem. One issue is the lack of large-scale labeled speech data from non-native speakers. Additionally, relying on only one aspect (e.g., accuracy) at the phonetic level may not provide a sufficient assessment of pronunciation quality and L2 intelligibility. It is possible to leverage segmental/phonetic-level features such as goodness of pronunciation (GOP); however, feature granularity may cause a discrepancy in prosodic-level (suprasegmental) pronunciation assessment. In this study, a Wav2vec 2.0-based MDD system and a Goodness Of Pronunciation feature-based Transformer are employed to characterize L2 intelligibility. Here, an L2 speech dataset with human-annotated prosodic (suprasegmental) labels is used for multi-granular and multi-aspect pronunciation assessment and for identifying factors important for intelligibility in L2 English speech. The study provides a transformative comparative assessment of automated pronunciation scores versus the relationship between suprasegmental features and listener perceptions, which taken collectively can help support the development of instantaneous assessment tools and solutions for L2 learners.
Full Text Available
Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment

https://doi.org/10.21437/Interspeech.2022-11039

Yang, Mu; Hirschi, Kevin; Looney, Stephen Daniel; Kang, Okim; Hansen, John H.L. (September 2022, Interspeech)

Full Text Available
Joint Hypoglycemia Prediction and Glucose Forecasting via Deep Multi-Task Learning

https://doi.org/10.1109/ICASSP43922.2022.9746129

Yang, Mu; Dave, Darpit; Erraguntla, Madhav; Cote, Gerard L.; Gutierrez-Osuna, Ricardo (May 2022, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

We present a multitask learning approach to the problem of hypoglycemia (HG) prediction in diabetes. The approach is based on a state-of-the-art time series forecasting model, N-BEATS, and extends it by adding a classification task so that the model performs both glucose forecasting (i.e., predicting future glucose values) and HG prediction (i.e., probability of future HG events sometime within the prediction horizon). We also propose an alternative loss function that penalizes forecasting errors in the HG range. We evaluate the approach on a dataset containing over 1.6M recordings from 112 patients with type 1 diabetes who wore a continuous glucose monitor (CGM) for 90 days. Our results show that the classification branch significantly outperforms the forecasting branch on the problem of HG prediction, and that the new loss function is more effective at reducing forecasting errors in the HG range than multi-task learning.
Full Text Available
Measurement of the relative non-degenerate two-photon absorption cross-section for fluorescence microscopy

https://doi.org/10.1364/OE.27.008335

Sadegh, Sanaz; Yang, Mu-Han; Ferri, Christopher G.; Thunemann, Martin; Saisan, Payam A.; Devor, Anna; Fainman, Yeshaiahu (January 2019, Optics Express)

Full Text Available
Plasmonic enhanced two-photon absorption in silicon photodetectors for optical correlators in the near-infrared

https://doi.org/10.1364/OL.41.004445

Smolyaninov, Alexei; Yang, Mu-Han; Pang, Lin; Fainman, Yeshaiahu (September 2016, Optics Letters)

Electronic Metamaterials with Tunable Second-order Optical Nonlinearities

https://doi.org/10.1038/s41598-017-10304-2

Lin, Hung-Hsi; Vallini, Felipe; Yang, Mu-Han; Sharma, Rajat; Puckett, Matthew W.; Montoya, Sergio; Wurm, Christian D.; Fullerton, Eric E.; Fainman, Yeshaiahu (December 2017, Scientific Reports)

Full Text Available
Neurophotonic Tools for Microscopic Measurements and Manipulation: Status Report

https://doi.org/10.1117/1.NPh.9.S1.013001

Abdelfattah, Ahmed; Allu, Srinivasa Rao; Campbell, Robert E.; Cheng, Xiaojun; Čižmár, Tomáš; Costantini, Irene; Emiliani, Valentina; Fomin-Thunemann, Natalie; Gilad, Ariel; Fernández Alfonso, Tomás; et al. (January 2022, Neurophotonics)

Full Text Available
Synthesis of second-order nonlinearities in dielectric-semiconductor-dielectric metamaterials

https://doi.org/10.1063/1.4978640

Lin, Hung-Hsi; Yang, Mu-Han; Sharma, Rajat; Puckett, Matthew W.; Montoya, Sergio; Wurm, Christian D.; Vallini, Felipe; Fullerton, Eric E.; Fainman, Yeshaiahu (March 2017, Applied Physics Letters)
